Abstract

We propose a method to increase the capacity achieved by a uniform prior in discrete memoryless channels (DMCs) with high input cardinality. It consists in appropriately reducing the input set. Different design criteria for the input subset are discussed. We develop an efficient algorithm to solve this problem based on the maximization of the cut-off rate. The method is applied to a mono-bit transceiver MIMO system, and it is shown that the capacity can be approached within tenths of a dB by employing standard binary codes while avoiding the use of distribution shapers.

1. INTRODUCTION 

The challenge with nonsymmetric and/or nonbinary channels is that the capacity-achieving probability distribution is not uniform [1]. In this case, distribution shapers are needed to approach capacity, which results in very large block sizes [2]. This is of course impractical for channels with low-complexity receivers or where the sender and receiver wish to communicate without substantial average delay. To avoid distribution shapers, we require that all signals are used evenly. In [3], it is shown that the degradation from using a uniform prior, instead of the capacity-achieving distribution, is worst for the Z-channel, and that the amount of degradation for binary-input channels is quite small.
For general DMCs, and especially those with high in- 
put cardinality, we show in this paper that uniform 
capacity can be increased by using a reduced packing 
of symbols. To this end, we try to find the best sub- 
set X' from the original input set X so that all its 
symbols are distinguishable at the receiver and maximally spaced. This may be a crucial approach, especially in large DMCs where the transmitter somehow has access to the channel state information, for example by means of a feedback channel (or where the channel is known a priori). We could also require the subset size |X'| = K to be a power of 2 if a binary code is employed, so that encoded bits can be directly mapped to the channel input symbols.
Our paper is organized as follows. First, we formulate the problem mathematically based on different criteria in Section 2. Then we solve the problem of the optimal subset search based on two different criteria in Sections 3 and 4, respectively. In Section 5, we apply our method to a mono-bit multi-input multi-output (MIMO) channel and show its usefulness. Finally, in Section 6, we test the performance of the selected input subset when combined with an LDPC code over this kind of channel.

2. SYSTEM MODEL AND PROBLEM FORMULATION

We consider a DMC with finite input alphabet $\mathcal{X}$ of cardinality $|\mathcal{X}| = M$ and finite output alphabet $\mathcal{Y}$. We assume the input to the channel to be a random variable $X$ and let $P(x)$ be the channel input probability mass function (pmf) and $P(y|x)$ the channel law, i.e., the probability of receiving $Y = y$ when sending $X = x$. As stated in the introduction, we require that the distribution $P(x)$ have the form

$$ P(x) = \begin{cases} 1/K, & x \in \mathcal{X}' \\ 0, & \text{otherwise,} \end{cases} \tag{1} $$

i.e., it is a uniform distribution over a subset $\mathcal{X}' \subset \mathcal{X}$. Different criteria can be considered to find the best subset $\mathcal{X}'$ for a given size $|\mathcal{X}'| = K < M$:
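a) Maximizing the mutual information $I(X;Y)$

$$ \max_{\mathcal{X}'} \; \log_2 K + \frac{1}{K} \sum_{x \in \mathcal{X}'} \sum_{y \in \mathcal{Y}} P(y|x) \log_2 \frac{P(y|x)}{\sum_{x' \in \mathcal{X}'} P(y|x')} \tag{2} $$

b) Minimizing the symbol error rate (SER) assuming ML decoding

$$ \min_{\mathcal{X}'} \; 1 - \frac{1}{K} \sum_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}'} P(y|x) \tag{3} $$

c) Maximizing the cut-off rate $R_0$

$$ \max_{\mathcal{X}'} \; -\log_2 \left[ \frac{1}{K^2} \sum_{y \in \mathcal{Y}} \Big( \sum_{x \in \mathcal{X}'} \sqrt{P(y|x)} \Big)^2 \right] \tag{4} $$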
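To make criteria a) to c) concrete, the following minimal sketch evaluates all three for a uniform prior over a candidate subset. It assumes the standard uniform-prior expressions for the mutual information, the ML symbol error rate, and the cut-off rate; the function name `subset_criteria` and the toy 3x3 channel are hypothetical and serve only as an illustration.

```python
import math

def subset_criteria(P, subset):
    """Evaluate criteria a)-c) for a uniform prior over `subset`.

    P[x][y] is the channel law P(y|x); `subset` lists the selected inputs.
    Returns (mutual information in bits, ML symbol error rate, cut-off rate in bits).
    """
    K = len(subset)
    mutual_info = math.log2(K)
    prob_correct = 0.0
    sqrt_sum = 0.0
    for y in range(len(P[0])):
        column = [P[x][y] for x in subset]
        column_sum = sum(column)
        for p in column:
            if p > 0.0:  # terms with P(y|x) = 0 contribute nothing
                mutual_info += p * math.log2(p / column_sum) / K
        prob_correct += max(column) / K           # ML picks the most likely input
        sqrt_sum += sum(math.sqrt(p) for p in column) ** 2
    ser = 1.0 - prob_correct
    cutoff_rate = -math.log2(sqrt_sum / K ** 2)
    return mutual_info, ser, cutoff_rate

# Toy channel (hypothetical numbers): inputs 0 and 1 are hard to tell apart.
P = [
    [0.5, 0.5, 0.0],
    [0.5, 0.4, 0.1],
    [0.0, 0.1, 0.9],
]
full = subset_criteria(P, [0, 1, 2])
reduced = subset_criteria(P, [0, 2])
print("full set   :", full)
print("subset {0,2}:", reduced)
```

In this toy example, dropping the poorly distinguishable input 1 improves all three criteria at once, which illustrates why the criteria tend to agree on the selected subset.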
Note that in all these problems the subset size is assumed to be fixed a priori and larger than $2^C$, where $C$ denotes the true capacity.¹ Problem a) is the most interesting from an information-theoretic point of view. However, we were not able to find an efficient algorithm for solving it. Nevertheless, it turns out that all these criteria are strongly correlated, so that the solution of one problem is nearly optimal for the others. Therefore, we consider in this work, as alternatives, the SER minimization b) and the cut-off rate maximization c), owing to their more tractable structures.

Throughout our paper, $a_i$ denotes the i-th element of a vector $a$ and $A_{ij}$ the element of a matrix $A$ in row i and column j. The operators $(\cdot)^T$ and $\operatorname{tr}(\cdot)$ stand for the transpose and trace of a matrix, respectively. Vectors and matrices are denoted by lower- and upper-case italic bold letters. $0_M$ and $1_M$ stand for the M-length all-zeros and all-ones vectors, respectively.

3. MINIMIZING THE SYMBOL ERROR RATE

In this part, we look for the subset that minimizes the SER. Note that this optimization task is an NP-hard problem, and an exhaustive search becomes intractable for large DMCs. In [3], a binary switching algorithm (BSA), previously used for index optimization in vector quantization, has been proposed to overcome the complexity problems of the brute-force approach. This algorithm finds, through systematic switching of symbols, a local optimum of a given cost function. If the algorithm is executed several times with different random initializations, the global optimum may be found with high probability. The binary switching algorithm can also be used here to search for the optimal subset. Its input is the initial subset, which is chosen randomly. The algorithm first generates an ordered list of the initially selected symbols, sorted in decreasing order of the costs calculated for the individual symbols (the probability of misdetecting $x$)

$$ c(x) = 1 - \sum_{y:\, \hat{x}(y) = x} P(y|x), \qquad \hat{x}(y) = \arg\max_{x' \in \mathcal{X}'} P(y|x'). \tag{5} $$

Then the algorithm tries to replace the symbol that has the highest cost with another symbol from the remaining set $\mathcal{X} \setminus \mathcal{X}'$, selected such that the decrease of the total cost due to the switch is as large as possible. If the resulting total cost is lower than the current cost, the switch is accepted and the iteration continues. If no such switch can be found for the symbol with the highest cost, the symbol with the second-highest cost is tried next, then the third, and so on. After an accepted switch, a new ordered list of symbols is generated, and the algorithm continues as described above until no further reduction of the total cost is possible. The BSA converges to a subset with a locally optimal cost. To find the subset with the globally optimal cost, we can start the algorithm several times with different initializations and select the result with the lowest total cost.

¹ Clearly, the subset size has to be chosen properly; this aspect will be discussed later.
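The switching procedure described in this section can be sketched as follows. This is a minimal illustration, not the exact implementation used for the results: it assumes that the per-symbol cost is the ML misdetection probability under a uniform prior over the current subset, and the names `symbol_costs`, `binary_switching`, and the toy channel are hypothetical.

```python
import random

def symbol_costs(P, subset):
    """Per-symbol ML misdetection probability for a uniform prior over `subset`."""
    costs = {x: 1.0 for x in subset}
    for y in range(len(P[0])):
        winner = max(subset, key=lambda x: P[x][y])  # ML decision for output y
        costs[winner] -= P[winner][y]
    return costs

def total_cost(P, subset):
    return sum(symbol_costs(P, subset).values())

def binary_switching(P, K, rng, max_iters=1000):
    """Return (sorted subset, SER) at a local optimum of the switching search."""
    M = len(P)
    subset = rng.sample(range(M), K)           # random initial subset
    best = total_cost(P, subset)
    for _ in range(max_iters):
        costs = symbol_costs(P, subset)
        ordered = sorted(subset, key=lambda x: costs[x], reverse=True)
        outside = [x for x in range(M) if x not in subset]
        switched = False
        for worst in ordered:                  # highest-cost symbol first
            trials = []
            for cand in outside:
                trial = [cand if x == worst else x for x in subset]
                trials.append((total_cost(P, trial), trial))
            if not trials:
                break
            cost, trial = min(trials, key=lambda t: t[0])
            if cost < best - 1e-12:            # accept only a strict improvement
                subset, best, switched = trial, cost, True
                break
        if not switched:                       # no symbol can be improved
            break
    return sorted(subset), best / K

# Toy channel (hypothetical numbers) and a few random restarts.
P = [
    [0.5, 0.5, 0.0],
    [0.5, 0.4, 0.1],
    [0.0, 0.1, 0.9],
]
runs = [binary_switching(P, K=2, rng=random.Random(seed)) for seed in range(5)]
best_subset, best_ser = min(runs, key=lambda r: r[1])
print(best_subset, best_ser)
```

Each pass recomputes the costs after an accepted switch, mirroring the regeneration of the ordered list; the restarts correspond to the different random initializations.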

4. MAXIMIZING THE CUT-OFF RATE 

The cut-off rate $R_0$ can be used for practical finite-length block codes in discrete memoryless channels to upper-bound codeword error rates after maximum likelihood decoding. Besides, it represents a lower bound on the channel capacity. Thus, maximizing the cut-off rate is essential for good performance in practice. Since $R_0$ under a uniform prior over $\mathcal{X}'$ is a decreasing function of $\sum_y \big(\sum_{x \in \mathcal{X}'} \sqrt{P(y|x)}\big)^2$, problem (4) can be reformulated as

$$ \min_{b} \; b^T A b \quad \text{s.t.} \quad 1_M^T b = K, \;\; b \in \{0,1\}^M, \tag{6} $$

where

$$ A_{ij} = \sum_{y \in \mathcal{Y}} \sqrt{P(y|x_i)\, P(y|x_j)}, \tag{7} $$

and the vector $b$ is a binary vector consisting only of the elements "0" and "1". The input symbols are here numbered consecutively from 1 to M. The ones in the vector $b$ indicate the symbols included in the subset $\mathcal{X}'$. The formulation (6) is a constrained binary quadratic minimization problem (constrained BQP), so we are dealing with an NP-hard combinatorial problem. It can be interpreted as a two-way partitioning problem with fixed partition size. The matrix coefficient $A_{ij}$ can be interpreted as the cost of selecting both inputs i and j into the subset $\mathcal{X}'$.
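As a sanity check of the quadratic formulation, the sketch below builds the cost matrix and solves the constrained BQP by exhaustive search on a tiny channel. It assumes that $A_{ij} = \sum_y \sqrt{P(y|x_i) P(y|x_j)}$, which is what the standard cut-off rate expression yields under a uniform prior; the helper names and the toy channel are hypothetical, and brute force is of course only viable for very small M.

```python
import itertools
import math

def cost_matrix(P):
    """A[i][j] = sum_y sqrt(P(y|i) P(y|j)) (assumed form of the cost matrix)."""
    M, L = len(P), len(P[0])
    return [[sum(math.sqrt(P[i][y] * P[j][y]) for y in range(L))
             for j in range(M)] for i in range(M)]

def best_subset_by_cutoff_rate(P, K):
    """Exhaustively minimize b^T A b over all binary b with exactly K ones."""
    A = cost_matrix(P)
    best_cost, best_subset = math.inf, None
    for subset in itertools.combinations(range(len(P)), K):
        cost = sum(A[i][j] for i in subset for j in subset)  # equals b^T A b
        if cost < best_cost:
            best_cost, best_subset = cost, subset
    # Cut-off rate of the uniform prior over the winning subset, in bits.
    return best_subset, -math.log2(best_cost / K ** 2)

# Toy channel (hypothetical numbers).
P = [
    [0.5, 0.5, 0.0],
    [0.5, 0.4, 0.1],
    [0.0, 0.1, 0.9],
]
subset, r0 = best_subset_by_cutoff_rate(P, K=2)
print(subset, r0)
```

The off-diagonal entries of the matrix act as pairwise confusability penalties, so the minimizer avoids selecting inputs whose output distributions overlap strongly.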

Now, we introduce the vector $s = [s_1, \ldots, s_n]^T$, with $n = M + 1$, and relate it to $b$ as follows

$$ s = s_n \begin{bmatrix} b \\ 1 \end{bmatrix}, \tag{8} $$

where the slack variable $s_n \in \{-1, 1\}$. This substitution is used to symmetrize the problem, which is necessary for the later convex problem formulation.² Then, it can be shown that problem (6) is equivalent to

$$ \min_{s} \; s^T B s \quad \text{s.t.} \quad s_n \cdot 1_n^T s = K + 1, \;\; s_n^2 = 1, \;\; s_i s_n - s_i^2 = 0 \;\; \forall i, \tag{9} $$

with

$$ B = \begin{bmatrix} A & 0_M \\ 0_M^T & 0 \end{bmatrix}. \tag{10} $$

By means of the substitution $S = s s^T$, where $S$ is a positive semidefinite matrix ($S \succeq 0$) of rank 1, problem (9) can be rewritten as the matrix optimization problem

$$ \hat{S} = \arg\min_{S \succeq 0} \operatorname{tr}(BS) \quad \text{s.t.} \quad S_{ii} = S_{in} \;\; \forall i, \;\; S_{nn} = 1, \;\; \sum_{i=1}^{n} S_{ni} = K + 1 \tag{11} $$

and $\operatorname{rank}(S) = 1$.
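The lifting step can be checked numerically. The sketch below assumes the relation $s = s_n [b^T, 1]^T$ and a block matrix $B$ that embeds $A$ with a zero last row and column, which is one consistent reading of the substitution; under these assumptions, the rank-one matrix $S = s s^T$ satisfies the linear constraints and reproduces the original quadratic cost for both signs of $s_n$. All function names are hypothetical.

```python
import random

def lift(b, s_n):
    """Return s = s_n * [b; 1] and the rank-one matrix S = s s^T."""
    s = [s_n * b_i for b_i in b] + [s_n]
    S = [[s_i * s_j for s_j in s] for s_i in s]
    return s, S

def embed(A):
    """B = [[A, 0], [0^T, 0]] (assumed form, so that s^T B s = b^T A b)."""
    n = len(A) + 1
    return [list(row) + [0.0] for row in A] + [[0.0] * n]

def trace_product(B, S):
    n = len(B)
    return sum(B[i][j] * S[j][i] for i in range(n) for j in range(n))

def satisfies_constraints(S, K):
    """Check S_ii = S_in for all i, S_nn = 1, and sum_i S_ni = K + 1."""
    n = len(S)
    diag_ok = all(S[i][i] == S[i][n - 1] for i in range(n))
    return diag_ok and S[n - 1][n - 1] == 1 and sum(S[n - 1]) == K + 1

rng = random.Random(0)
M, K = 6, 3
A = [[0.0] * M for _ in range(M)]
for i in range(M):
    for j in range(i, M):
        A[i][j] = A[j][i] = rng.random()   # arbitrary symmetric cost matrix

b = [1, 0, 1, 0, 0, 1]                     # a feasible selection with K ones
quad = sum(A[i][j] * b[i] * b[j] for i in range(M) for j in range(M))

results = []
for s_n in (1, -1):                        # both signs give the same S
    s, S = lift(b, s_n)
    results.append((satisfies_constraints(S, K), trace_product(embed(A), S)))
print(quad, results)
```

That both signs of the slack variable lead to the same matrix is the symmetry exploited by the relaxation.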

The program (11) is not convex because of the rank-one constraint. Recently, semidefinite programming (SDP) has been shown to be a very promising approach to combinatorial optimization, where the SDP serves as a tractable convex relaxation of NP-hard problems. In [5], for example, a quasi-maximum-likelihood method based on SDP for lattice decoding is introduced.

In order to obtain a tractable SDP relaxation of (11), we remove the rank-one restriction from the feasible set:

$$ \hat{S} = \arg\min_{S \succeq 0} \operatorname{tr}(BS) \quad \text{s.t.} \quad S_{ii} = S_{in} \;\; \forall i, \;\; S_{nn} = 1, \;\; \sum_{i=1}^{n} S_{ni} = K + 1. \tag{12} $$

Note that this optimization has a linear objective subject to affine equalities and a linear matrix inequality. Such problems are known as SDPs and can be solved efficiently in polynomial time [5].³

If the optimal solution of the SDP has rank one, then the relaxation is tight. Otherwise, some special techniques are required to convert the SDP relaxation solution into an approximate Boolean QP solution. A randomization method has been proposed for this conversion process 0. This is motivated by a probabilistic argument. Assume that, rather than choosing the optimal $s$ in a deterministic fashion, we instead want to find a probability distribution with covariance matrix $S = E[s s^T]$ that yields good solutions on average. For symmetry reasons, we can always restrict ourselves to distributions with zero mean. As for the constraints, we may require that the solutions we generate fulfill them in expectation. Minimizing the expected value of the cost in (9) under these average constraints yields the SDP relaxation presented in (12).

² If $s$ is optimal, then so is $-s$.

³ It is possible to solve SDP relaxations of Boolean QPs for problems of fairly large size (approx. 500 variables with interior-point methods, 5000+ with special techniques).



Algorithm 1 Codebook Selection Algorithm

1: Initialization: n = M + 1
2: Solve the semidefinite problem:
   $\hat{S} = \arg\min_{S \succeq 0} \operatorname{tr}(BS)$ s.t. $S_{nn} = 1$, $\sum_i S_{ni} = K + 1$ and $S_{ii} = S_{in} \; \forall i$
3: Cholesky factorization: $\hat{S} = V^T V$
4: Randomization:
5: for $l = 1, \ldots, N_{rand}$ do
6:    Randomly generate a vector $u^{(l)}$ uniformly distributed on the n-dimensional unit sphere.
7:    Compute $s^{(l)} = V^T u^{(l)}$.
8:    $s^{(l)} \leftarrow s^{(l)} \cdot \operatorname{sign}(s_n^{(l)})$
10:   Quantize the K highest entries of $[s_1^{(l)}, \ldots, s_{n-1}^{(l)}]$ to 1 and the others to 0.
11: end for
12: Choose $s = \arg\min_{s^{(l)}} s^{(l)T} B s^{(l)}$
13: Take $b = [s_1, \ldots, s_{n-1}]^T$ as the approximate solution



Usually, to further improve the approximation quality, the randomization is repeated a number of times, and the randomized solution yielding the best objective function value, i.e., the smallest cost $s^T B s$, is chosen as the approximate solution. This procedure is stated in Steps 4 to 12 of Algorithm 1. Often, this randomization method achieves an accurate approximation with a modest number of randomizations. Another, simpler approach consists in taking $s$ as the eigenvector of $\hat{S}$ associated with its largest eigenvalue and then simply performing the quantization procedure (Steps 8, 9, 10), which can also provide good solutions.
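The randomization step can be sketched as below. This is an illustration under stated assumptions rather than the exact procedure: it does not solve the SDP itself but takes a factor $V$ with $\hat{S} = V^T V$ as input, it draws the random direction by normalizing a Gaussian vector, and it keeps the candidate with the smallest quadratic cost, consistent with the minimization in (12). The function name and the small test instance are hypothetical.

```python
import math
import random

def randomized_rounding(V, A, K, num_trials, rng):
    """Round a factor V (with S = V^T V) to a binary selection with K ones."""
    n = len(V[0])
    M = n - 1
    best_b, best_cost = None, math.inf
    for _ in range(num_trials):
        # A normalized Gaussian vector is uniformly distributed on the unit sphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(len(V))]
        norm = math.sqrt(sum(x * x for x in g))
        u = [x / norm for x in g]
        # s = V^T u has covariance proportional to S.
        s = [sum(V[k][i] * u[k] for k in range(len(V))) for i in range(n)]
        # Resolve the sign ambiguity using the slack entry s_n.
        sign = 1.0 if s[n - 1] >= 0 else -1.0
        s = [sign * x for x in s]
        # Keep the K largest of the first M entries.
        top = sorted(range(M), key=lambda i: s[i], reverse=True)[:K]
        b = [1 if i in top else 0 for i in range(M)]
        cost = sum(A[i][j] for i in top for j in top)   # b^T A b
        if cost < best_cost:
            best_b, best_cost = b, cost
    return best_b, best_cost

# Hypothetical instance in which the relaxation is tight: S = s* s*^T.
A = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.6],
    [0.8, 0.7, 1.0, 0.5],
    [0.1, 0.6, 0.5, 1.0],
]
s_star = [1.0, 0.0, 0.0, 1.0, 1.0]                 # b* = [1, 0, 0, 1], s_n = 1
V = [s_star] + [[0.0] * 5 for _ in range(4)]       # V^T V = s* s*^T
b, cost = randomized_rounding(V, A, K=2, num_trials=20, rng=random.Random(1))
print(b, cost)
```

When the SDP solution has rank one, as in this instance, every random draw is a scaled copy of the optimal vector, so the rounding recovers the exact selection.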

5. APPLICATION TO COARSELY QUANTIZED MIMO

As an application, we consider a point-to-point mono-bit quantized MIMO system for high-speed links [8], where the transmitter employs T antennas and the receiver has N antennas. Fig. 1 shows the general form of a quantized MIMO system, where $H \in \mathbb{C}^{N \times T}$ is the channel matrix, known at both the transmitter and the receiver. We assume each entry $x_j$ of the source symbol $x$ is drawn from a discrete QPSK modulation, so that the source alphabet $\mathcal{X}$ has a cardinality of $|\mathcal{X}| = M = 4^T$. The average energy of $x$ is fixed to 1, i.e., $x^H x = 1$. The vector $\eta$ refers to uncorrelated zero-mean complex circular Gaussian noise with equal variance per dimension given by $\sigma_\eta^2$. The unquantized channel output $r \in \mathbb{C}^N$ is given by

$$ r = \sqrt{P_{Tr}}\, H x + \eta, \tag{13} $$

where $P_{Tr}$ is the transmit power.

Figure 1: One-bit Quantized MIMO System

In this system, the real parts $r_{i,R}$ and the imaginary parts $r_{i,I}$ of the receive signals $r_i$, $1 \le i \le N$, are each quantized by a 1-bit resolution quantizer. Thus, the resulting quantized signals read as

$$ y_{i,c} = \operatorname{sign}(r_{i,c}) \in \{-1, 1\}, \quad \text{for } c \in \{R, I\}, \; 1 \le i \le N. \tag{14} $$

Obviously, the scalar (complex) quantization of the output of the QPSK MIMO channel with hard-decision receivers produces an equivalent channel with $4^T$ inputs and $4^N$ outputs. The resulting channel can be seen as a large, strongly nonsymmetric discrete memoryless channel (DMC) [5], characterized by a transition probability matrix $P(y|x)$. Since all of the real and imaginary components of the receiver noise $\eta$ are statistically independent with variance $\sigma_\eta^2/2$, we can express each of the conditional probabilities as the product of the conditional probabilities on each receiver dimension

$$ P(y|x) = \prod_{c \in \{R,I\}} \prod_{i=1}^{N} P(y_{i,c}|x) \tag{15} $$

$$ \phantom{P(y|x)} = \prod_{c \in \{R,I\}} \prod_{i=1}^{N} \Phi\!\left( \sqrt{\frac{2 P_{Tr}}{\sigma_\eta^2}} \; y_{i,c} \, [Hx]_{i,c} \right), \tag{16} $$

where $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt$ is the cumulative normal distribution function.

As an example, we use a randomly generated 4x4 MIMO channel matrix. The solid line in Fig. 2 shows the true capacity of this channel, obtained by optimizing the input distribution using the Blahut-Arimoto algorithm [5]. The capacity achieved by the uniform prior over all symbols is also plotted (dashed line), where a considerable rate loss can be observed. Now, applying Algorithm 1 to this channel for two different values of K (K = 64 and K = 16) leads to the marked solid curves. The semidefinite program was solved using the SeDuMi package [6]. Although the selection is based on the cut-off criterion, the resulting subsets almost achieve the capacity over a quite large SNR interval. It seems that the two subset sizes are sufficient to cover a wide dynamic range of the SNR. Besides, it turns out that the optimal subset does not depend strongly on the SNR. This shows the usefulness of this approach.

Figure 2: Capacity improvement of 4x4 QPSK MIMO with mono-bit receiver under K = 16 and K = 64 selected symbols found by Algorithm 1.

Figure 3 shows the capacity results obtained for the codebooks selected based on the SER criterion and the BSA as described in Section 3 under the same settings. Obviously, the results are very similar to those in Fig. 2, which confirms our previous hypothesis that the selection does not depend strongly on the chosen criterion. We note that the convergence time of the BSA depends on the channel conditions and the noise level, and the BSA may become impractical for larger DMCs. All in all, it is preferable to employ Algorithm 1 rather than the BSA, since its convergence time is fixed and only polynomial in the size of $\mathcal{X}$.

Figure 3: Capacity improvement of 4x4 QPSK MIMO with mono-bit receiver under K = 16 and K = 64 selected symbols using the SER minimization solved by the BSA.
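The equivalent DMC of the one-bit MIMO link can be tabulated with a few lines of code. The sketch below assumes the signal model $r = \sqrt{P_{Tr}} H x + \eta$ with noise variance $\sigma_\eta^2/2$ per real dimension, so that each one-bit output is correct with probability $\Phi(\sqrt{2 P_{Tr}/\sigma_\eta^2}\, y\, [Hx])$; the function names and the 2x2 channel matrix are hypothetical and not the matrix used for the figures.

```python
import itertools
import math

def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_bit_mimo_channel(H, p_tr, sigma2):
    """Return (inputs, outputs, P) with P[x][y] = P(y|x) for the one-bit channel."""
    N, T = len(H), len(H[0])
    amp = 1.0 / math.sqrt(2.0 * T)            # QPSK entries scaled so x^H x = 1
    qpsk = [complex(a, b) * amp for a in (1, -1) for b in (1, -1)]
    inputs = list(itertools.product(qpsk, repeat=T))          # 4^T symbols
    outputs = list(itertools.product((1, -1), repeat=2 * N))  # 4^N sign patterns
    scale = math.sqrt(2.0 * p_tr / sigma2)
    P = []
    for x in inputs:
        hx = [sum(H[i][j] * x[j] for j in range(T)) for i in range(N)]
        # Stack real parts, then imaginary parts, of the noiseless receive vector.
        parts = [v.real for v in hx] + [v.imag for v in hx]
        P.append([
            math.prod(std_normal_cdf(scale * y_k * r_k) for y_k, r_k in zip(y, parts))
            for y in outputs
        ])
    return inputs, outputs, P

# Hypothetical 2x2 channel matrix, used only for illustration.
H = [[1.0 + 0.5j, 0.3 - 0.2j],
     [-0.4 + 0.1j, 0.8 + 0.6j]]
inputs, outputs, P = one_bit_mimo_channel(H, p_tr=1.0, sigma2=0.5)
print(len(inputs), len(outputs), sum(P[0]))
```

The resulting table can be passed directly to a subset selection routine, since it is an ordinary DMC transition matrix with one row per input symbol.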

6. PERFORMANCE WITH CODING 

Approaching the channel capacity of coarsely quantized MIMO systems is, however, not straightforward. Figure 4 shows the bit error ratio (BER) obtained with an ensemble of randomly generated LDPC codes of length n = 250 applied to the same channel as in the previous section. The parity-check matrices were generated following [10]. The figure compares the BER of our input set reduction method with K = 32 to that of the full input set (K = 256) when combined with an LDPC code. In both cases the total rate is R = 2.5 bits/channel use, and the rate of the LDPC code was adjusted accordingly. We apply a decoupled detection/decoding approach, where first the log-likelihood ratios

$$ \log \frac{\Pr[c[i] = 1 \,|\, y[n]]}{1 - \Pr[c[i] = 1 \,|\, y[n]]} \tag{18} $$

are computed and then fed to the input of the belief-propagation algorithm. Here $c[i]$ denotes the i-th bit output by the LDPC encoder, while $y[n]$ is the n-th quantized received vector, where

$$ n = \lfloor i / \log_2 K \rfloor. \tag{19} $$

This comes about since $\log_2 K$ code bits are transferred per channel use; hence, for each received quantized vector $y[n]$, the log-likelihood ratios of $\log_2 K$ encoded bits are computed. Obviously, the proper reduction of the input set improves the BER behavior significantly. Besides, the full input set cannot be handled gracefully, leading to a relatively large error floor. This is caused by the fact that, with coarse channel output quantization, many different input symbols may be mapped to the same output symbol at high SNR. To resolve these ambiguities, a small code rate and a large block length would be necessary, which again leads to high latency and a complex receiver. Fortunately, reducing the input set solves this problem in a simpler way. As we see in Fig. 4, the optimized constellation does not exhibit any error floor, and the receiver's task becomes easier with the more distinguishable selected symbols.
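The decoupled log-likelihood computation can be sketched as follows. The bit-to-symbol labeling is not specified in the text, so this example assumes a natural binary labeling (symbol k carries the binary expansion of k, most significant bit first) and a uniform prior over the K selected symbols; the function names and the 4x4 toy channel are hypothetical.

```python
import math

def bit_llrs(P_sub, y, eps=1e-300):
    """LLRs log(Pr[c=1|y] / Pr[c=0|y]) of the log2(K) bits sent in one channel use.

    P_sub[k][y] is P(y | k-th selected symbol). With a uniform prior, the
    posterior of a bit is proportional to the sum of P(y|x) over the symbols
    whose label has that bit set.
    """
    K = len(P_sub)
    bits_per_use = K.bit_length() - 1
    assert 1 << bits_per_use == K, "K must be a power of 2"
    llrs = []
    for pos in range(bits_per_use):
        shift = bits_per_use - 1 - pos        # most significant bit first
        p_one = sum(P_sub[k][y] for k in range(K) if (k >> shift) & 1)
        p_zero = sum(P_sub[k][y] for k in range(K) if not (k >> shift) & 1)
        llrs.append(math.log((p_one + eps) / (p_zero + eps)))  # eps avoids log(0)
    return llrs

def channel_use_index(i, K):
    """Index n of the channel use carrying code bit i (zero-based), as in (19)."""
    return i // (K.bit_length() - 1)

# Toy channel over K = 4 selected symbols (hypothetical numbers).
P_sub = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
llrs = bit_llrs(P_sub, y=3)
print(llrs, channel_use_index(5, 4))
```

The LLRs produced this way would be passed to the belief-propagation decoder without any feedback from the decoder to the detector, which is what makes the scheme decoupled.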

7. CONCLUSION 

A method is proposed that allows approaching the true capacity of large DMCs while using a uniformly distributed reduced input set. This has essential practical implications, since it allows the use of binary codes to approach the capacity without distribution shapers. In addition, the idea of reducing the input to symbols that are maximally spaced makes the task of the decoder considerably easier and inherently provides some robustness against the quality of the channel state information at the transmitter and other parameter fluctuations (e.g., SNR) in the system. To find the optimal input subset, we explored, among others, SDP relaxation techniques, which turn out to be a very efficient approach providing excellent solutions for this problem.




Figure 4: Bit error ratio of the LDPC code after decoupled log-likelihood computation and belief-propagation algorithm.